1.  INTRODUCTION

This document provides a detailed description of the lexicon design implemented
on the IBM 360/67 computer for the SDC version of the Vicens-Reddy Speech
Recognition System. The Vicens-Reddy System handled a maximum lexicon of
1,000 entries contained in 900,000 36-bit words of core. Because core of this
size is not available under TS/DMS, and because future lexicons are expected to
exceed 1,000 entries, we have designed a means of storing the lexicon on disc.
This design allows rapid access to a large lexicon.

The programming necessary for setting up a new lexicon, inserting lexicon samples,
and retrieving those already inserted was accomplished by additions and
modifications to the checked-out Vicens-Reddy Segmentation-Recognition System on
the 360/67. This new system is referred to as CWIPER.* The entire system, except
for one subroutine (the first and last character hash), is programmed in
FORTRAN IV, Version G. CWIPER was compiled in three sections and linked through
the TS/DMS link editor to run under TS/DMS on the 360/67.

CWIPER can select a group of possible candidates for a speech sample match. The
mapping and evaluation routines necessary to select the best candidate will be
implemented in the near future.

2.  LEXICON  DISC  DESIGN 

The lexicon is a TS/DMS S-l file** residing on a 2314 disc pack. It is composed
of fixed-length records, each 8,192 bytes long. The fixed-length record size is
necessary for defining the lexicon as a direct-access data set under FORTRAN IV
(G level). A record size of 8,192 bytes was chosen because it is the length of
two core pages.


* Contextual Word In Phrase Extraction Routine

**A  single-volume,  sequential,  variable-record-size  file 
The 2314 disc pack contains 203 physical cylinders of 20 tracks each. Under
TS/DMS, each physical cylinder contains 4 logical cylinders of 5 tracks each.
Three of the physical cylinders are used as alternates should any tracks
become defective. Physical cylinder 0 is used by the TS/DMS cataloger and
contains information about files that have been created on the pack. The
maximum space available to a user of a 2314 disc pack, then, is 199 * 4 = 796
logical cylinders, each containing 5 tracks.

The maximum number of lexicon entries will be limited to 65,535 (2^16 - 1) entries.
Each entry necessitates a 16-byte entry in the LXPART table and a variable-length
entry in the LXALL table.


The number of LXPART entries per record is calculated as follows:

    8,192 bytes per record / 16 bytes per entry = 512 entries per record

The number of disc records needed to contain a full LXPART is:

    65,535 total entries / 512 entries per record = 128 disc records (rounded up)


The average length of a LXALL entry is 360 bytes. (The maximum length is 767
bytes.) The number of average LXALL entries per record is calculated to be:

    8,192 bytes per record / 360 bytes per entry = 22 entries per record (rounded down)

The number of disc records needed to contain a full LXALL of average-size
entries is:

    65,535 total entries / 22 entries per record = 2,979 disc records (rounded up)


The total number of bytes available on a 2314 disc is 199 physical cylinders * 4
logical cylinders * 5 tracks * 7,231 bytes per track = 28,779,380 bytes. The
approximate number of bytes needed for 65,535 average lexicon entries is
27,000,000, which includes 164 bytes per record for SPAM usage and 184 bytes
per logical cylinder for SPAM usage.
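
The record and capacity arithmetic of this section can be reproduced with a short Python sketch. The function names are illustrative, and the sketch assumes that an entry never spans two records, which is why entries per record is rounded down and the record count is rounded up.

```python
import math

RECORD_BYTES = 8192        # fixed record size (two core pages)
LXPART_ENTRY_BYTES = 16    # one LXPART entry
LXALL_AVG_BYTES = 360      # average LXALL entry
MAX_ENTRIES = 65_535       # 2**16 - 1


def records_needed(entries, entry_bytes, record_bytes=RECORD_BYTES):
    """Return (entries per record, records needed).

    Assumes an entry never spans two records.
    """
    per_record = record_bytes // entry_bytes
    return per_record, math.ceil(entries / per_record)


def pack_capacity(cylinders=199, logical_per_cylinder=4, tracks=5,
                  bytes_per_track=7231):
    """Bytes available to a user of one 2314 disc pack under TS/DMS."""
    return cylinders * logical_per_cylinder * tracks * bytes_per_track


print(records_needed(MAX_ENTRIES, LXPART_ENTRY_BYTES))  # (512, 128)
print(records_needed(MAX_ENTRIES, LXALL_AVG_BYTES))     # (22, 2979)
print(pack_capacity())                                  # 28779380
```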
3.  CORE LAYOUT FOR THE INUSE LEXICON

Three main core blocks are used for the lexicon. COMLEX is the lexicon control
area, CORPT is the LXPART area, and CORALL is the LXALL area.

3.1  COMLEX 

COMLEX is defined as a 2,048-word block located in COMMON/COMLX/. It contains
all the control information for lexicon handling and is the first lexicon record
on the disc. It contains the following tables and items in sequence:

VFHASH  A table of 256 half-word integer pointers that make up the
        vowel-fricative hash table.

CHHASH  A table of 676 half-word integer pointers that make up the
        first-last character hash table.

LXPPT   A table of 128 half-word integer pointers that point to LXPART
        records.

LXUSE   A table of 128 half-word integer indicators representing the
        core status of corresponding LXPPT entries. A non-zero LXUSE
        entry indicates the block of CORPT in which the LXPART record
        resides.

COREA   A full-word integer containing the record number of the LXALL
        record currently in core at CORALL.

CORES   A table of 16 half-word integer entries. Each entry represents
        a block of the CORPT table from 1 to NBUF (see Section 3.2) and
        is a pointer to the LXPPT and LXUSE table entries for the
        LXPART record currently in that block.

NREC    A full-word integer containing the number of the next available
        record to be assigned to a new LXPART or LXALL record.

NLEXNO  A full-word integer containing the next number for assignment
        as a lexicon entry number.

NALLRC  A full-word integer containing the record number of the last
        incomplete LXALL record.

NALLWD  A full-word integer containing the word position at which the
        next-to-be-added LXALL entry would begin in the NALLRC record.

NRFLAG  A full-word logical that is a flag set in the LOOK subroutine
        and tested by the INSERT subroutine. If .TRUE., then a new
        LXALL record is needed; if .FALSE., then a current LXALL record
        (NALLRC) can be used.
CHTEMP  A full-word integer containing the number of the first and
        last character hash of the print name of the current speech
        sample to be inserted in the lexicon.

VFTEMP  A full-word integer containing the number of vowels-number of
        fricatives hash of the current speech sample.

3.2  CORPT

CORPT is defined as an NBUF * 2,048-word integer table located in COMMON/COMCOR/
CORALL, CORPT. NBUF is defined as a full-word integer and is contained in the
COMMON/COMDAT/ area of the main program and all subroutines. It is currently
set to 2 in BLOCK DATA. NBUF indicates the number of 2,048-word blocks
available for the simultaneous loading of LXPART records. The minimum size of
NBUF is 1 and the maximum size is 16. (The maximum is determined by the size of
the CORES table; NBUF may exceed the maximum if the CORES table is increased
correspondingly.)

CORPT2 is defined as an NBUF * 4,096 half-word table and is equivalent to CORPT.
It is defined for ease in referencing half-word fields in FORTRAN.

3.3  CORALL

CORALL is defined as a 2,048-word integer table located in the LXALL storage
area.

CORAL2 is defined as a 4,096-half-word integer table and is equivalent to
CORALL. It is defined for ease in referencing half-word fields in FORTRAN.

CORAL4 is defined as an 8,192-byte LOGICAL*1 table and is equivalent to CORAL2
and CORALL. It is defined for ease in referencing logical bytes in FORTRAN.
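
As a rough check on the COMLEX description in Section 3.1, the following Python sketch adds up the listed tables and items. It assumes the items are packed contiguously in the listed order with no padding, that CHTEMP and VFTEMP are the last two items, and that a half-word is 2 bytes and a full-word 4 bytes; the source does not state the exact packing, so the offsets are illustrative only.

```python
# (name, number of elements, bytes per element)
# half-word = 2 bytes, full-word = 4 bytes
COMLEX_FIELDS = [
    ("VFHASH", 256, 2), ("CHHASH", 676, 2),
    ("LXPPT", 128, 2), ("LXUSE", 128, 2),
    ("COREA", 1, 4), ("CORES", 16, 2),
    ("NREC", 1, 4), ("NLEXNO", 1, 4),
    ("NALLRC", 1, 4), ("NALLWD", 1, 4),
    ("NRFLAG", 1, 4), ("CHTEMP", 1, 4), ("VFTEMP", 1, 4),
]
COMLEX_BYTES = 2048 * 4  # the 2,048-word block, i.e. one disc record


def comlex_offsets(fields=COMLEX_FIELDS):
    """Return ({name: byte offset}, total bytes used), assuming no padding."""
    offsets, position = {}, 0
    for name, count, size in fields:
        offsets[name] = position
        position += count * size
    return offsets, position


offsets, used = comlex_offsets()
print(used, COMLEX_BYTES)   # 2440 8192
print(offsets["COREA"])     # 2376
```

Under these assumptions the control items occupy well under one 8,192-byte record, and every full-word item happens to fall on a 4-byte boundary.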


28  May  1971 




System  Development  Corporation 
TM-4652/400/00 




4.  DETAILED  DESCRIPTION  OF  THE  LEXICON  TABLES 

4.1  VFHASH  TABLE 

The VFHASH table is the vowel-fricative hash table. The table consists of 256
half-word integers. An entry address in this table is computed by:

    16 * features-matrix vowel count + features-matrix fricative count + 1

The 16-bit entry (if non-zero) points to a four-word entry in the LXPART table
by lexicon entry number. It is the lexicon entry number of the last entry
having the same hash code. When a new lexicon entry is inserted, the lexicon
entry number found in the VFHASH entry is used to reference its LXPART entry
and change its next VFHASH pointer from zero (the VFHASH table) to NLEXNO
(the lexicon number of the current sample to be inserted). The LXPART entry
for the new sample (NLEXNO) will use the lexicon number found in the VFHASH
entry as its previous VFHASH pointer and zero (the VFHASH table) as its next
VFHASH pointer. The VFHASH table entry will be set to NLEXNO.


    VFHASH(i) ---> LEXNO 50 --PREVIOUS--> LEXNO 20 --PREVIOUS--> LEXNO 5
                   LEXNO 50 <----NEXT---- LEXNO 20 <----NEXT---- LEXNO 5

where:

    VFHASH(i) is 50.
    The next VFHASH pointer for LEXNO 50 is 0.
    The previous VFHASH pointer for LEXNO 50 is 20.
    The next VFHASH pointer for LEXNO 20 is 50.
    The previous VFHASH pointer for LEXNO 20 is 5.
    The next VFHASH pointer for LEXNO 5 is 20.
    The previous VFHASH pointer for LEXNO 5 is 0.
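
The insertion rule described above amounts to linking each new entry at the head of a doubly linked chain. The Python sketch below illustrates it with a dictionary standing in for the LXPART table; the function name and the dictionary layout are hypothetical and are not taken from CWIPER.

```python
def insert_entry(hash_table, lxpart, slot, new_no):
    """Link lexicon entry new_no at the head of the chain for one hash slot.

    hash_table: list of lexicon entry numbers, 0 meaning "empty slot".
    lxpart: dict mapping lexicon number -> {"prev": n, "next": n},
            where 0 stands for the hash table itself / end of chain.
    """
    old_head = hash_table[slot]
    if old_head != 0:
        # The former last entry now has a successor.
        lxpart[old_head]["next"] = new_no
    lxpart[new_no] = {"prev": old_head, "next": 0}
    hash_table[slot] = new_no


vfhash = [0] * 257          # index 0 unused so slots run 1..256
lxpart = {}
for lexno in (5, 20, 50):   # insertion order of the example above
    insert_entry(vfhash, lxpart, 1, lexno)

print(vfhash[1])    # 50
print(lxpart[20])   # {'prev': 5, 'next': 50}
```

Inserting entries 5, 20, and 50 into the same slot reproduces the pointer values listed in the example.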


The vowel-fricative hash is based on a maximum of 15 vowels and 15 fricatives
in a speech sample. The hash uses only the low-order bits of each count, so a
wraparound occurs if there are more than 15 vowels or 15 fricatives in a speech
sample.
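
A minimal sketch of the VFHASH index computation follows. It reads "low-order bits" as the low-order four bits of each count, an assumption that is consistent with the stated maximum of 15 but is not spelled out in the text; the function name is illustrative.

```python
def vfhash_index(vowel_count, fricative_count):
    """Return the 1-based VFHASH entry number, 16*V + F + 1.

    Each count is reduced to its low-order 4 bits, so counts above 15
    wrap around (assumed reading of "low-order bits").
    """
    v = vowel_count & 0xF
    f = fricative_count & 0xF
    return 16 * v + f + 1


print(vfhash_index(0, 0))     # 1
print(vfhash_index(1, 0))     # 17
print(vfhash_index(15, 15))   # 256
print(vfhash_index(16, 0))    # 1  (16 vowels wraps around to 0)
```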
VFHASH Table Layout

    Entry
    Number      16*V + F + 1

       1        16*0  +  0 + 1
       2        16*0  +  1 + 1
       3        16*0  +  2 + 1
       4        16*0  +  3 + 1
       5        16*0  +  4 + 1
       6        16*0  +  5 + 1
       7        16*0  +  6 + 1
       8        16*0  +  7 + 1
       9        16*0  +  8 + 1
      10        16*0  +  9 + 1
      11        16*0  + 10 + 1
      12        16*0  + 11 + 1
      13        16*0  + 12 + 1
      14        16*0  + 13 + 1
      15        16*0  + 14 + 1
      16        16*0  + 15 + 1
      17        16*1  +  0 + 1
       .             .
      32        16*1  + 15 + 1
       .             .
      48        16*2  + 15 + 1
       .             .
     256        16*15 + 15 + 1

(where V is the features-matrix vowel count and F is the features-matrix
fricative count)
4.2  CHHASH TABLE

The CHHASH table is the first-last character hash of the print name (i.e., the
first and last alphabetic characters). The table consists of 676 half-word
integers. Each lexicon entry has a print-name area that can be a maximum of 255
characters long. A hash of the first and last letters yields an entry into the
table:

    26 * converted first letter + converted last letter + 1


    LETTER    EBCDIC REPRESENTATION    CONVERTED NUMBER

      A                C1                      0
      B                C2                      1
      C                C3                      2
      D                C4                      3
      E                C5                      4
      F                C6                      5
      G                C7                      6
      H                C8                      7
      I                C9                      8
      J                D1                      9
      K                D2                     10
      L                D3                     11
      M                D4                     12
      N                D5                     13
      O                D6                     14
      P                D7                     15
      Q                D8                     16
      R                D9                     17
      S                E2                     18
      T                E3                     19
      U                E4                     20
      V                E5                     21
      W                E6                     22
      X                E7                     23
      Y                E8                     24
      Z                E9                     25
The conversion rules are:

    for C1 <= character <= C9:   character = character - C1
    for D1 <= character <= D9:   character = character - D1 + 9
    for E2 <= character <= E9:   character = character - E2 + 18
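
The conversion rules and the CHHASH formula can be sketched in Python as follows. The original hash is the one subroutine not written in FORTRAN, and its actual code is not shown in this document, so this is only an illustration of the stated rules. It uses Python's standard `cp037` codec to obtain EBCDIC codes; skipping non-alphabetic characters and upper-casing the name are assumptions.

```python
def convert_letter(code):
    """Map an EBCDIC upper-case letter code to 0..25 per the rules above."""
    if 0xC1 <= code <= 0xC9:        # A..I
        return code - 0xC1
    if 0xD1 <= code <= 0xD9:        # J..R
        return code - 0xD1 + 9
    if 0xE2 <= code <= 0xE9:        # S..Z
        return code - 0xE2 + 18
    raise ValueError(f"not an EBCDIC upper-case letter: {code:#x}")


def chhash_index(print_name):
    """Return the 1-based CHHASH entry number for a print name."""
    letters = [c for c in print_name.upper() if c.isascii() and c.isalpha()]
    if not letters:
        raise ValueError("print name contains no alphabetic characters")
    first = convert_letter(letters[0].encode("cp037")[0])
    last = convert_letter(letters[-1].encode("cp037")[0])
    return 26 * first + last + 1


print(chhash_index("AA"))      # 1
print(chhash_index("BZ"))      # 52
print(chhash_index("SPEECH"))  # 476  (S = 18, H = 7)
print(chhash_index("ZZ"))      # 676
```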


The 16-bit entry in the CHHASH table pointed to by the first and last letter
hash is a pointer to a 4-word entry in the LXPART table (i.e., a lexicon entry
number). It is the lexicon entry number of the last entry having the same hash
code. When a new lexicon entry is inserted, the lexicon entry number found in
the CHHASH entry is used to reference its LXPART entry and change its next CHHASH
pointer from zero (the CHHASH table) to NLEXNO (the lexicon number of the current
sample to be inserted). The LXPART entry for the new sample (NLEXNO) will use
the lexicon number found in the CHHASH entry as its previous CHHASH pointer and
zero (the CHHASH table) as its next CHHASH pointer. The CHHASH table entry will
be set to NLEXNO.


    CHHASH(i) ---> LEXNO 50 --PREVIOUS--> LEXNO 20 --PREVIOUS--> LEXNO 5
                   LEXNO 50 <----NEXT---- LEXNO 20 <----NEXT---- LEXNO 5

where:

    CHHASH(i) is 50.
    The next CHHASH pointer for LEXNO 50 is 0.
    The previous CHHASH pointer for LEXNO 50 is 20.
    The next CHHASH pointer for LEXNO 20 is 50.
    The previous CHHASH pointer for LEXNO 20 is 5.
    The next CHHASH pointer for LEXNO 5 is 20.
    The previous CHHASH pointer for LEXNO 5 is 0.
CHHASH Table Layout

    entry     1st       last
    number    letter    letter    26 * first letter + last letter + 1

       1        A         A       26*0  +  0 + 1
       2        A         B       26*0  +  1 + 1
       3        A         C       26*0  +  2 + 1
       4        A         D       26*0  +  3 + 1
       5        A         E       26*0  +  4 + 1
       6        A         F       26*0  +  5 + 1
       7        A         G       26*0  +  6 + 1
       8        A         H       26*0  +  7 + 1
       9        A         I       26*0  +  8 + 1
      10        A         J       26*0  +  9 + 1
      11        A         K       26*0  + 10 + 1
      12        A         L       26*0  + 11 + 1
      13        A         M       26*0  + 12 + 1
      14        A         N       26*0  + 13 + 1
      15        A         O       26*0  + 14 + 1
       .        .         .            .
      26        A         Z       26*0  + 25 + 1
       .        .         .            .
      52        B         Z       26*1  + 25 + 1
       .        .         .            .
     676        Z         Z       26*25 + 25 + 1
4.3  LXPPT TABLE

The LXPPT table is a table pointing to LXPART records. It contains 128 half-word
pointers. Each pointer points to a LXPART record that contains a maximum of
512 four-word entries for a range of lexicon entry numbers.

For example: LXPPT(1) points to the first LXPART record, which can contain
entries for lexicon numbers 1 through 512; LXPPT(2) points to the second LXPART
record, which can contain entries for lexicon numbers 513 through 1,024; etc.,
for a total of 128 LXPPT entries given a maximum-size lexicon (65,535 entries).

If the lexicon is not large enough to fill all the LXPART records, then the
LXPPT pointers to unused LXPART records will be zero.

The method of finding the correct LXPPT entry and, within the pointed-to record,
the correct LXPART entry is:

    PPT   = Integer Part ((lexicon number - 1) / 512) + 1
          = entry in LXPPT table

    ENPFT = Fractional Part ((lexicon number - 1) / 512) * 512 + 1
          = the LXPART entry number in the record pointed to by LXPPT

4.4  LXUSE TABLE

The LXUSE table contains 128 half-word integer entries. Each entry corresponds
to a LXPPT entry and indicates the core status of the respective LXPART record.
LXUSE(i) = 0 ... 16; if it is not zero, then it contains the core-block number
of CORPT that currently contains the LXPART record pointed to by LXPPT(i).

LXUSE(i) is an index to CORES and vice versa:

    CORES(LXUSE(i)) = i        e.g., CORES(2) = 3
    LXUSE(CORES(j)) = j        e.g., LXUSE(3) = 2
Thus LXUSE(3) = 2 means that the third logical record of LXPART is currently in
core beginning at CORPT(2049). If we are checking to see what core is available,
CORES(2) = 3 tells us that the second block of CORPT is currently occupied by the
third logical record of LXPART. The physical record number of the third logical
record of LXPART is given by LXPPT(3).
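
The lookups of Sections 4.3 and 4.4 can be sketched in Python. The function names are hypothetical; the constants (512 entries per record, 4 words per entry, 2,048 words per block) come from the text, and the tables are held in 1-based lists with index 0 unused so that the indices match the FORTRAN subscripts.

```python
MAXPTR = 512             # LXPART entries per record
WORDS_PER_ENTRY = 4      # one 16-byte LXPART entry
WORDS_PER_BLOCK = 2048   # one CORPT block


def locate(lexicon_number):
    """Return (LXPPT index, entry number within the LXPART record)."""
    ppt = (lexicon_number - 1) // MAXPTR + 1
    entry = (lexicon_number - 1) % MAXPTR + 1
    return ppt, entry


def corpt_word(block, entry):
    """1-based CORPT word at which an entry starts in a given core block."""
    return (block - 1) * WORDS_PER_BLOCK + (entry - 1) * WORDS_PER_ENTRY + 1


def tables_agree(lxuse, cores):
    """Check that LXUSE and CORES are inverses wherever they are non-zero."""
    for record in range(1, len(lxuse)):
        block = lxuse[record]
        if block != 0 and cores[block] != record:
            return False
    for block in range(1, len(cores)):
        record = cores[block]
        if record != 0 and lxuse[record] != block:
            return False
    return True


lxuse = [0] * 129
cores = [0] * 17
lxuse[3], cores[2] = 2, 3       # the example: LXUSE(3) = 2, CORES(2) = 3

print(locate(513))              # (2, 1)
print(corpt_word(2, 1))         # 2049
print(tables_agree(lxuse, cores))  # True
```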

4.5  LXPART  TABLE 

For each lexicon entry there is a 16-byte LXPART entry. The LXPART entry
identifies the record number and starting word of the larger LXALL entry. It
also contains pointers to the previous and next entries with the same VFHASH,
pointers to the previous and next entries with the same CHHASH, and the
vowel-fricative map in binary that represents the lexicon entry.


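16-byte entry:

    +-------------------------------+-------------------------------+
    | Pointer to previous entry     | Pointer to previous entry     |
    | with same CHHASH              | with same VFHASH              |
    +-------------------------------+-------------------------------+
    | Pointer to next entry         | Pointer to next entry         |
    | with same CHHASH              | with same VFHASH              |
    +-------------------------------+-------------------------------+
    | Vowel-Fricative Pattern Binary                                |
    +-------------------------------+-------------------------------+
    | Record number of              | Starting word of              |
    | LXALL entry                   | LXALL within record           |
    +-------------------------------+-------------------------------+

The vowel-fricative pattern is right justified.
Each vowel = 01 (binary)
Each fricative = 10 (binary)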
A pattern (for example, VFFVVF) is given in binary and hexadecimal as follows:

    pattern:      V  F  F  V  V  F
    code:         01 10 10 01 01 10
    binary:       0000 0000 0000 0000 0000 0110 1001 0110
    hexadecimal:  00 00 06 96
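
The encoding can be sketched in Python as below. The function name is illustrative, and the sketch makes no attempt to handle patterns longer than the 16 symbols that fit in one 32-bit word, a case the text does not discuss.

```python
CODES = {"V": 0b01, "F": 0b10}   # vowel = 01, fricative = 10


def encode_pattern(pattern):
    """Pack a string of 'V'/'F' symbols into a right-justified integer."""
    value = 0
    for symbol in pattern:
        value = (value << 2) | CODES[symbol]
    return value


word = encode_pattern("VFFVVF")
print(format(word, "032b"))   # 00000000000000000000011010010110
print(format(word, "08X"))    # 00000696
```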

The purpose of the LXPART table is to be able to quickly accumulate a subset
of lexicon entries that are possible matches for a particular speech sample.

The vowel-fricative count of the speech sample is used to reference a VFHASH
entry that (if non-zero) gives the lexicon entry number of the last lexicon
entry having the same VFHASH code. This lexicon entry number is first used in
referencing the LXPPT and LXUSE tables to be sure the needed LXPART record is
in core. When the LXPART record is in core, the proper lexicon entry is found
and its vowel-fricative pattern is matched with that of the speech sample. If
it matches, then it is saved on the stack as a possible match. Regardless of
a match, the LXPART pointer to the previous entry with the same VFHASH yields
another lexicon entry number, etc. For a small lexicon (less than 1,024 entries),
the entire LXPART table can be core resident.
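
The candidate search just described can be sketched as a walk along the previous-entry pointers. In this Python illustration the LXPART table is a plain dictionary that is assumed to be entirely core resident, so the LXPPT/LXUSE residency check is omitted; the function name and field names are hypothetical.

```python
def candidates(vfhash, lxpart, slot, sample_pattern):
    """Collect lexicon numbers in one VFHASH chain whose pattern matches.

    vfhash: list of chain heads (0 = empty slot).
    lxpart: dict lexicon number -> {"pattern": int, "prev_vf": int}.
    """
    matches = []
    lexno = vfhash[slot]
    while lexno != 0:
        entry = lxpart[lexno]
        if entry["pattern"] == sample_pattern:
            matches.append(lexno)        # saved as a possible match
        lexno = entry["prev_vf"]         # follow the chain regardless
    return matches


# Three entries with 3 vowels and 3 fricatives share slot 16*3 + 3 + 1 = 52.
vfhash = [0] * 257
vfhash[52] = 50
lxpart = {
    50: {"pattern": 0x696, "prev_vf": 20},   # VFFVVF
    20: {"pattern": 0x56A, "prev_vf": 5},    # VVVFFF
    5:  {"pattern": 0x696, "prev_vf": 0},    # VFFVVF
}
print(candidates(vfhash, lxpart, 52, 0x696))   # [50, 5]
```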

4.6  LXALL  TABLE 

The LXALL entry is the compact features matrix of a speech sample, plus
identification, lexicon entry number, print name in EBCDIC, number of characters
in the print name, and some reserved space for later expansion (perhaps in the
syntactic area).

The maximum size of a LXALL entry is:

    General information                          32 bytes
    8 bytes per row for maximum of 60 rows*     480 bytes
    Print name, maximum 255 bytes               255 bytes
                                                ---------
                                                767 bytes

* The maximum number of rows in the Vicens-Reddy Features Matrix (1968) is 60;
  however, this lexicon design would allow up to 255 (2^8 - 1) rows.




In  estimating  storage,  the  average  size  of  360  bytes  was  used.  This  was  based 
on  the  Vicens-Reddy  average  of  90  words  per  lexicon  entry. 

4.7  THE  LXALL  ENTRY  LAYOUT 


Fields of a LXALL entry, in order:

    0 - Unused field
    No. of characters in print name
    LEXICON ENTRY NUMBER
    IDSESS - Session Number
    SAMPLE - Sample Number
    MANNO  - Man Number
    VERSN1 - Preprocessing Version No.
    VERSN2 - Segmentation Version No.
    VERSN3 - Recognition Version No.
    VERSN4 - Mapping Version No.
    0 - Unused field
    Beg. Q Matrix Segment Number - 1
    0 - Unused field
    Vowel Count
    Fricative Count
    Row Count + 1
    Vowel 1 Row No. + 1
    Vowel 2 Row No. + 1
    Vowel 3 Row No. + 1
    Vowel 4 Row No. + 1
    Vowel 5 Row No. + 1
    Two words for each row i of the features matrix:
        TYPE(i), DUR(i), A1(i), Z1(i), A2(i), Z2(i), A3(i), Z3(i)
    PRINT NAME IN EBCDIC FOR MAXIMUM OF 255 CHARACTERS
Where:

Row  Count  *1+1 

SXT  ,  ■  if  not  a  local  minimum 

"  2,,  if  a  local  minimum  (the  local  maximitms  were  removed  in 
16 

the  recognition  process) 

Note: If the lexicon entry has been entered by the user from the terminal and
not from a live or recorded speech sample, then SESSNO = 0.

5.  METHOD  OF  CHANGING  LEXICON  DIMENSIONS 

COMMON/COMSET/MAXNO, MAXREC, MAXPTR, MAXWDR contains all the variables affecting
the lexicon size and usage except for NBUF (the number of 2,048-word blocks
available for the simultaneous loading of LXPART records). MAXNO is the maximum
number of lexicon entries, MAXREC is the maximum number of records in the
lexicon, MAXPTR is the maximum number of entries in a LXPART record, and MAXWDR
is the maximum number of words in a LXPART or LXALL record. These are currently
initialized in BLOCK DATA as follows:

    NBUF   = 2
    MAXNO  = 1024
    MAXREC = 60
    MAXPTR = 512
    MAXWDR = 2048

The dimensions of the lexicon can be changed by redefining these five integer
variables in the BLOCK DATA program. Remember, however, that when the new
lexicon is opened (and thereafter), the size of the file parameter and the
record size in bytes must correspond to the new definitions.
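
A consistency check of this kind can be sketched in Python. The function is hypothetical and is not part of CWIPER. The first two checks follow from figures given in this document (4-byte words, four-word LXPART entries); the third, that the file must be large enough to hold MAXREC records, is an assumption. The file and record sizes used in the call are those of the current lexicon (800000 and 8192 bytes).

```python
def check_dimensions(maxno, maxrec, maxptr, maxwdr, file_bytes, record_bytes):
    """Return a list of inconsistencies between BLOCK DATA and file sizes."""
    problems = []
    if maxwdr * 4 != record_bytes:
        problems.append("record size is not MAXWDR 4-byte words")
    if maxptr * 4 != maxwdr:
        problems.append("MAXPTR four-word entries do not fill MAXWDR words")
    if maxrec * record_bytes > file_bytes:        # assumed requirement
        problems.append("file too small for MAXREC records")
    if maxno > 65_535:
        problems.append("MAXNO exceeds the 16-bit lexicon number limit")
    return problems


# Current settings from BLOCK DATA and the current lexicon file.
print(check_dimensions(1024, 60, 512, 2048, 800_000, 8192))   # []
```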
For example, the current lexicon is opened by:

    LEXICON V41003 800000 F R8192

where:  "LEXICON" is the file name,
        "V41003" is the disc-pack designation,
        "800000" is the byte-size of the file,
        "F" indicates fixed-record size, and
        "R8192" is the byte-size of the record.


6.  INITIALIZING THE LEXICON


When CWIPER asks the user "NEW LEXICON?" and the response is "YES," it then
asks for the new lexicon file description and opens the new lexicon file.
The COMLEX core area is cleared to zeros except for the following settings:

    NLEXNO   = 1    Lexicon entry number 1 will be the first assigned lexicon
                    number.

    NALLRC   = 3    The current LXALL record will be record 3.

    NALLWD   = 1    The first LXALL entry will begin at word 1 of record 3.

    COREA    = 3    The LXALL record 3 is considered to be in core.

    LXPPT(1) = 2    The first logical LXPART record is physical record 2.

    LXUSE(1) = 1    The LXPART logical record 1 is in the first block of
                    CORPT in core.

    CORES(1) = 1    The first block of CORPT contains the physical record
                    pointed to by LXPPT(1).

    NREC     = 4    The next record to be assigned for LXALL or LXPART will
                    be physical record 4.

COMLEX  is  then  written  out  as  the  first  record  of  the  new  lexicon  file. 
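
The initial state can be sketched in Python with a dictionary standing in for the COMLEX block. The function name and the dictionary representation are illustrative only; Python lists are 0-based, so LXPPT(1) in the text corresponds to `comlex["LXPPT"][0]` here.

```python
def new_comlex():
    """Return a zeroed COMLEX with the non-zero initial settings applied."""
    comlex = {
        "VFHASH": [0] * 256, "CHHASH": [0] * 676,
        "LXPPT": [0] * 128, "LXUSE": [0] * 128,
        "COREA": 0, "CORES": [0] * 16,
        "NREC": 0, "NLEXNO": 0, "NALLRC": 0, "NALLWD": 0,
        "NRFLAG": False, "CHTEMP": 0, "VFTEMP": 0,
    }
    comlex["NLEXNO"] = 1      # first lexicon number to assign
    comlex["NALLRC"] = 3      # current LXALL record
    comlex["NALLWD"] = 1      # first LXALL entry starts at word 1
    comlex["COREA"] = 3       # LXALL record 3 considered in core
    comlex["LXPPT"][0] = 2    # LXPPT(1): first LXPART record is record 2
    comlex["LXUSE"][0] = 1    # LXUSE(1): it sits in CORPT block 1
    comlex["CORES"][0] = 1    # CORES(1): block 1 holds logical record 1
    comlex["NREC"] = 4        # next free physical record
    return comlex


comlex = new_comlex()
print(comlex["NREC"], comlex["LXPPT"][0], comlex["COREA"])   # 4 2 3
```

Record 1 is COMLEX itself, record 2 the first LXPART record, and record 3 the first LXALL record, which is why the next free record is 4.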


